8 research outputs found

Applications in Monocular Computer Vision using Geometry and Learning : Map Merging, 3D Reconstruction and Detection of Geometric Primitives

Author: Gillsjö David
Publication venue: Lund University / Centre for Mathematical Sciences /LTH
Publication date: 01/01/2023

As the dream of autonomous vehicles moving around in our world comes closer, the problem of robust localization and mapping is essential to solve. In this inherently structured and geometric problem we also want the agents to learn from experience in a data-driven fashion. How modern Neural Network models can be combined with Structure from Motion (SfM) is an interesting research question, and this thesis studies some related problems in 3D reconstruction, feature detection, SfM and map merging. In Paper I we study how a Bayesian Neural Network (BNN) performs in Semantic Scene Completion, where the task is to predict a semantic 3D voxel grid for the Field of View of a single RGBD image. We propose an extended task and evaluate the benefits of the BNN when encountering new classes at inference time. It is shown that the BNN outperforms the deterministic baseline. Papers II-III are about detection of points, lines and planes defining a Room Layout in an RGB image. Due to the repeated textures and homogeneous colours of indoor surfaces it is not ideal to only use point features for Structure from Motion. The idea is to complement the point features by detecting a Wireframe (a connected set of line segments) which marks the intersection of planes in the Room Layout. Paper II concerns a task for detecting a Semantic Room Wireframe and implements a Neural Network model utilizing a Graph Convolutional Network module. The experiments show that the method is more flexible than previous Room Layout Estimation methods and performs better than previous Wireframe Parsing methods. Paper III takes the task closer to Room Layout Estimation by detecting a connected set of semantic polygons in an RGB image. The end-to-end trainable model is a combination of a Wireframe Parsing model and a Heterogeneous Graph Neural Network. We show promising results by outperforming state-of-the-art models for Room Layout Estimation using synthetic Wireframe detections. However, the joint Wireframe and Polygon detector requires further research to compete with the state-of-the-art models. In Paper IV we propose minimal solvers for SfM with parallel cylinders. The problem may be reduced to estimating circles in 2D, and the paper contributes theory for the two-view relative motion and two-circle relative structure problems. Fast solvers are derived and experiments show good performance both in simulation and on real data. Papers V-VII cover the task of map merging. That is, given a set of individually optimized point clouds with camera poses from an SfM pipeline, how can the solutions be effectively merged without completely re-solving the Structure from Motion problem? Papers V-VI introduce an effective method for merging and show its effectiveness through experiments on real and simulated data. Paper VII considers the matching problem for point clouds and proposes minimal solvers that allow for deformation of each point cloud. Experiments show that the method robustly matches point clouds with drift in the SfM solution.

Lund University Publications

Moving object detection in urban environments

Author: Gillsjö David
Publication venue: Linköpings universitet, Tekniska högskolan
Publication date: 01/01/2012

Successful, high-precision localization is an important feature for autonomous vehicles in an urban environment. GPS solutions are not sufficient on their own, and laser, sonar and radar are often used as complementary sensors. Localization with these sensors requires the use of techniques grouped under the acronym SLAM (Simultaneous Localization And Mapping). These techniques work by comparing the current sensor inputs to either an incrementally built or a known map, while also adding the information to the map. Most SLAM techniques assume the environment to be static, which means that dynamics and clutter in the environment might cause SLAM to fail. To obtain a more robust algorithm, the dynamics need to be dealt with. This study seeks a solution where measurements from different points in time can be used in pairwise comparisons to detect non-static content in the mapped area. Parked cars could, for example, be detected at a parking lot by using measurements from several different days. The method successfully detects most non-static objects in the different test datasets from the sensor. The algorithm can be used in conjunction with Pose-SLAM to get a better localization estimate and a map for later use. This map is good for localization with SLAM or other techniques since only static objects are left in it.

Publikationer från Linköpings universitet

In Depth Bayesian Semantic Scene Completion

Authors: Gillsjö David, Åström Kalle
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021

Lund University Publications

Semantic Room Wireframe Detection from a Single View

Authors: Flood Gabrielle, Gillsjö David, Åström Kalle
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/08/2022

Reconstruction of indoor surfaces with limited texture information or with repeated textures, a situation common for walls and ceilings, may be difficult with a monocular Structure from Motion system. We propose a Semantic Room Wireframe Detection task to predict a Semantic Wireframe from a single perspective image. Such predictions may be used with shape priors to estimate the Room Layout and aid reconstruction. To train and test the proposed algorithm we create a new set of annotations from the simulated Structured3D dataset. We show qualitatively that the SRW-Net handles complex room geometries better than previous Room Layout Estimation algorithms while quantitatively outperforming the baseline in non-semantic Wireframe Detection.

Lund University Publications

Minimal Solvers for Point Cloud Matching with Statistical Deformations

Authors: Flood Gabrielle, Gillsjö David, Heyden Anders, Tegler Erik, Åström Kalle
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2022

An important issue in simultaneous localisation and mapping is how to match and merge individual local maps into one global map. This is addressed within the field of robotics and is crucial for multi-robot SLAM. There are a number of different ways to solve this task depending on the representation of the map. To take advantage of matching and merging methods that allow for deformations of the local maps, it is important to find feature matches that capture such deformations. In this paper we present minimal solvers for point cloud matching using statistical deformations. The solvers use either three or four point matches. They solve for either a rigid or a similarity transformation, as well as shape deformation in the direction of the most important modes of variation. Given an initial set of tentative matches based on, for example, feature descriptors or machine learning, we use these solvers in a RANSAC loop to remove outliers among the tentative matches. We evaluate the methods on both synthetic and real data, compare them to RANSAC methods based on Procrustes, and demonstrate that the proposed methods improve on the current state of the art.

Lund University Publications

Generic Merging of Structure from Motion Maps with a Low Memory Footprint

Authors: Flood Gabrielle, Gillsjö David, Heyden Anders, Persson Patrik, Åström Kalle
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021

With the development of cheap image sensors, the amount of available image data has increased enormously, and the possibility of using crowdsourced collection methods has emerged. This calls for the development of ways to handle all these data. In this paper, we present new tools that will enable efficient, flexible and robust map merging. Assuming that separate optimisations have been performed for the individual maps, we show how only relevant data can be stored in a low memory footprint representation. We use these representations to perform map merging so that the algorithm is invariant to the merging order and independent of the choice of coordinate system. The result is a robust algorithm that can be applied to several maps simultaneously. The result of a merge can also be represented in the same type of low memory footprint format, which enables further merging and updating of the map in a hierarchical way. Furthermore, the method can perform loop closing and also detect changes in the scene between the capture of the different image sequences. Using both simulated and real data, from both a hand-held mobile phone and a drone, we verify the performance of the proposed method. Comment: Accepted at ICPR2020, 9 pages, 8 figures

arXiv.org e-Print Archive

Lund University Publications

The Multi-view Geometry of Parallel Cylinders

Authors: Engman Johanna, Flood Gabrielle, Gillsjö David, Larsson Viktor, Oskarsson Magnus, Tegler Erik, Åström Kalle
Publication venue: Springer
Publication date: 01/01/2023

In this paper we study structure from motion problems for parallel cylinders. Using sparse keypoint correspondences is an efficient (and standard) way to solve the structure from motion problem. However, point features are sometimes unavailable and they can be unstable over time and viewing conditions. Instead, we propose a framework based on silhouettes of quadric surfaces, with special emphasis on parallel cylinders. Such structures are quite common, e.g. trees, lampposts, pillars, and furniture legs. Traditionally, the projections of the center lines of such cylinders have been considered and used in computer vision. Here, we demonstrate that the apparent width of the cylinders also contains useful information for structure and motion estimation. We provide a mathematical analysis of relative structure and relative motion tensors, which is used to develop a number of minimal solvers for simultaneously estimating camera pose and scene structure from silhouette lines of cylinders. These solvers can be used efficiently in robust estimation schemes, such as RANSAC. We use Sampson-approximation methods for efficient estimation using over-determined data and develop averaging techniques. We also perform synthetic accuracy and robustness tests and evaluate our methods on a number of real-world scenarios.

Lund University Publications